Comparing the model forms estimating generalised diameter-height relationships
in Tecomella undulata plantations in hot arid region of India
Abstract: Four generalised diameter-height equations were developed and compared for pure, even-aged stands of Tecomella undulata
in the hot arid region of Rajasthan State in India. The data used to fit the equations consisted of 1 540 diameter-height observations
collected from plots laid out in uniformly stocked stands of varying age and density. The four equations were fitted by non-linear
least squares regression and evaluated using several statistical criteria. Finally, the equations, with the coefficient values
obtained during the fitting phase, were validated against an independent data set of 854 diameter-height observations. Overall,
equation (4) (Hui and Gadow function) performed best for both the fitting and the validation data sets.
Introduction

Tecomella undulata is a tree species found in the Thar Desert of
northwest and western India. The distribution of T. undulata is
restricted to the drier parts of Arabia, southern Pakistan and
northwest India, up to an elevation of 1200 m. It is a medium-sized
tree that produces quality timber and is the main source of
timber among the indigenous tree species of hot desert regions.
Its wood is strong, tough and durable. The wood is excellent for
firewood and charcoal (Anon 1994). T. undulata is an accepted
tree species in agroforestry, and a large population is found on
agricultural land. It acts as a soil-binding tree by spreading a
network of lateral roots near the soil surface and helps in
stabilizing shifting sand dunes.

Growth models for many indigenous species in India have not
yet been developed, and information about the growth of T. undulata
trees is rarely available. T. undulata fetches a high price in the
domestic market, where it is extensively used as timber for
making furniture etc. There is a high demand for T. undulata,
especially from sawmills and other timber and handicraft
industries.

Measurements of individual tree diameters and heights are
commonly used in forest inventories to obtain
estimates of growing stock. Diameters can be measured easily and at
low cost, but height measurements are time consuming, often
inaccurate and very difficult to take in dense plantations. Hence,
heights are derived indirectly from diameters, using a
known or estimated relationship between diameters and heights
(Van Laar and Akça 1997) and modelling the development of
this relationship over time. The relationship between diameters
and heights may be described using a height regression or a
bivariate diameter-height distribution. In practice, all the trees in
the plots are measured for diameter and a sub-sample is measured for
height. The data from the trees sampled for height are then used to
develop a diameter-height regression, which in turn is used to
estimate the heights of the other trees in the plot (Arabatzis and Burkhart
1992).

A height regression may be derived separately for each stand,
on the basis of pairs of diameter-height measurements obtained
during a stand inventory. As this approach is rather time consuming
and costly, a practical alternative is to develop generalised
height regressions, which embody certain basic characteristics
inherent in all individual height regressions (Prodan 1965;
Kramer and Akça 1987; Wenk et al. 1990; Gadow and Hui 1993;
Prodan et al. 1997; Hradetzky 1999).

Generalised diameter-height equations are useful tools for
forest inventory purposes. They are also commonly used as an
important element of many size class models for simulating the
development of silvicultural alternatives over time (Pascoa 1987).
Additionally, generalised diameter-height equations are sometimes
used to generate individual tree height increment data
(Hasenauer 1999; Kahn and Dursky 1999).

The aim of the present paper is to develop and compare some
generalised diameter-height equations for pure stands of T. undulata
in the arid region of Rajasthan State in India.

Materials and methods 

Site description and data 

The study area lies between 27°17' and 28°31' north latitude and
between 71°18' and 72°51' east longitude. The
area is characterized by large variation in diurnal and seasonal
temperatures. The mean monthly maximum temperature
varies between 39.5°C and 42.5°C, while the mean monthly minimum
temperature varies between 14°C and 16°C. The mean
annual rainfall in the area varies from 120 mm to 300 mm. The
majority of the rainfall is received during the southwest monsoon
season (July-September). The number of rainy days varies from
8 to 17 in the area. The mean monthly relative humidity in
the area fluctuates widely during the year, from 15% to 80%.
Wind speeds as high as 130 km per hour have been experienced
during the summer months. The terrain of the area is very undulating
and is frequently subjected to moving sand dunes. Dust
storms are also common in the region. The area consists of dry
undulating plains of hard sand and gravelly soil and rolling
plains of loose sand. The soil is rich in potash but poor in nitrogen
and organic matter, with very low productivity. There is a
semi-consolidated lime concretionary or gypsum stratum underneath
at many places. The soils are coarse textured with low
water retention capacity.

The data used in the present study were collected from the T.
undulata plantations established by the State Forest Department
of Rajasthan in the Indira Gandhi Canal Project area, which lies
in the drought-prone hot arid region. The plantations available
for the species under study cover ages ranging from 14 to
20 years and stand densities from 450 to 2 038 stems per hectare.

Two different data sets were used, one for model fitting (referred
to as the fitting data set) and another for model validation
(referred to as the validation data set). The fitting data set consisted
of 1 540 height-diameter measurements from 15 plots of 0.1 hm²
area installed in pure, uniformly stocked stands covering the available
range of ages and stand densities. The validation data set
had 854 diameter-height observations collected from 7 sample
plots located in other plantations of T. undulata in the same area.
A summary of some whole stand statistics for both data
sets is given in Table 1.


Table 1. Summary of some whole stand statistics for the fitting and validation data sets

| Variable | Minimum | Maximum | Mean | Standard deviation |
|---|---|---|---|---|
| Fitting data set | | | | |
| Basal area (m²/hm²) | 1.94 | 14.21 | 6.43 | 3.92 |
| Dominant height (m) | 4.56 | 8.47 | 6.06 | 1.18 |
| Stems per hectare | 450 | 1916 | 1116 | 425 |
| Quad. mean dia. (cm) | 9.18 | 17.77 | 13.35 | 2.79 |
| Validation data set | | | | |
| Basal area (m²/hm²) | 3.01 | 10.63 | 7.24 | 2.81 |
| Dominant height (m) | 4.91 | 8.54 | 6.14 | 1.34 |
| Stems per hectare | 517 | 2038 | 1263 | 454 |
| Quad. mean dia. (cm) | 11.50 | 20.20 | 14.32 | 3.11 |

Fitted models 

Generalised diameter-height equations differ from
ordinary diameter-height equations in that they include
quadratic mean diameter, stand basal area or stems per hectare as
an extra independent variable, so that the equation can be applied to
plantations on different sites with varying stocking.
Few such equations are available in the literature; they are
modifications of the Richards or Schumacher functions. This study
compares the predictive ability of four generalised diameter-height
equations.

Pienaar (1991) derived an equation from the Richards function
(cp. Richards 1959) for stands of slash pine in the southeastern
United States as follows:

$$h_i = a_1 H_0 \left(1 - a_2 e^{-a_3 d_i / D_g}\right)^{\beta} \qquad (1)$$

Mirkovich (1958) derived the following equation from the
Schumacher function (cp. Schumacher 1939) for oak stands in
central Europe (see also Michailoff 1943; Prodan et al. 1997):

$$h_i = 1.3 + \left(a_1 + a_2 H_0 - a_3 D_g\right) e^{-\beta / d_i} \qquad (2)$$

Stands with the same $D_g$ may have different stand densities,
and hence the model may be further improved by incorporating a
variable that accounts for stand density. Schroder and Gonzalez
(2001) modified Eq. (2) by incorporating stand basal
area as an independent variable:

$$h_i = 1.3 + \left(a_1 + a_2 H_0 - a_3 D_g + a_4 G\right) e^{-\beta / \sqrt{d_i}} \qquad (3)$$

On the basis of the allometric growth theory, Gadow and Hui
(1993) developed a generalised height regression for stands
of Cunninghamia lanceolata in southern China:

$$h_i = 1.3 + a_1 H_0^{a_2} d_i^{\beta_1 H_0^{\beta_2}} \qquad (4)$$

In the above equations, $h_i$ is the height of tree $i$ (m), $d_i$ the
breast height diameter over bark of tree $i$ (cm), $H_0$ the dominant
stand height (m), $D_g$ the quadratic mean diameter of the stand
(cm), $G$ the stand basal area (m²/hm²), $a_1$, $a_2$, $a_3$, $a_4$, $\beta$, $\beta_1$ and $\beta_2$ are the
parameters to be estimated, and 1.3 is a constant used to avoid the
prediction of a height less than 1.3 m when $d_i$ is very small.
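
To make the role of the stand variables concrete, the following minimal Python sketch evaluates two of the model forms, Eq. (2) and Eq. (4), using the coefficients reported in Table 2. The form used for Eq. (4), h = 1.3 + a1·H0^a2·d^(β1·H0^β2), is an assumption based on the Hui and Gadow function and the structure of the pooled equation given at the end of the paper. The stand values (dominant height 6 m, quadratic mean diameter 13 cm) and the three tree diameters are hypothetical, chosen close to the means in Table 1.

```python
import math


def height_eq2(d, h0, dg, a1, a2, a3, beta):
    """Eq. (2): h = 1.3 + (a1 + a2*H0 - a3*Dg) * exp(-beta / d)."""
    return 1.3 + (a1 + a2 * h0 - a3 * dg) * math.exp(-beta / d)


def height_eq4(d, h0, a1, a2, beta1, beta2):
    """Eq. (4), assumed form: h = 1.3 + a1 * H0**a2 * d**(beta1 * H0**beta2)."""
    return 1.3 + a1 * h0 ** a2 * d ** (beta1 * h0 ** beta2)


# Coefficients as reported in Table 2 (fitting data set).
EQ2_COEFFS = {"a1": 2.954, "a2": 0.631, "a3": 0.020, "beta": 4.765}
EQ4_COEFFS = {"a1": 0.412, "a2": 0.571, "beta1": 0.436, "beta2": 0.064}

# Hypothetical stand: dominant height 6 m, quadratic mean diameter 13 cm.
H0, DG = 6.0, 13.0

for d in (8.0, 13.0, 18.0):  # hypothetical breast height diameters (cm)
    h2 = height_eq2(d, H0, DG, **EQ2_COEFFS)
    h4 = height_eq4(d, H0, **EQ4_COEFFS)
    print(f"d = {d:4.1f} cm  Eq. (2): {h2:.2f} m  Eq. (4): {h4:.2f} m")
```

Both functions return heights above 1.3 m that increase with diameter; Eq. (4) needs only the tree diameter and the dominant height as inputs.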

Each equation was fitted to the fitting data set. As the equations
are intrinsically nonlinear, iterative nonlinear fitting is required
to estimate the parameters (cp. Draper and Smith 1981).
In the present paper, the simplex algorithm provided in the
non-linear estimation procedure of the STATISTICA statistical
software package (Statistica 1994) was used for parameter
estimation.

Model evaluation and validation 

The comparison of the four fitted equations was based on graphical
and quantitative analysis of the residuals ($e_i$). Graphical analysis
of residuals, searching for discrepancies or patterns, is an important
step in evaluating fitted models (Gadow and Hui 1999).
Residuals were examined graphically to check for any trend.
A linear regression of predicted on observed values was also fitted
to assess the performance of the fitted models. The ideal values for
the intercept and slope of this regression are 0 and 1, respectively.
Five statistical criteria were examined: bias ($\bar{E}$), which
tests the systematic deviation of the model from the observations;
root mean squared error (RMSE), which measures the
accuracy of the estimates; the adjusted coefficient of determination
($R^2_{adj}$), which shows the proportion of the total variance that
is explained by the model, adjusted for the number of model
parameters and the number of observations; model precision
(MPR), a standardised sum of squares criterion proposed by
Freese (1960) for evaluating the precision of a fitted model; and
Akaike's information criterion differences (AICd), an
index for selecting the best model by minimizing the
Kullback-Leibler distance (Burnham and Anderson 1998). The
expressions for these criteria are given below.

Bias:

$$\bar{E} = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)}{n} \qquad (5)$$

Root mean squared error:

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - p}} \qquad (6)$$

Adjusted coefficient of determination:

$$R^2_{adj} = 1 - \frac{(n - 1) \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{(n - p) \sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (7)$$

Model precision (ideal value 0):

$$MPR = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sigma^2} \qquad (8)$$

Akaike's information criterion differences:

$$AICd = n \ln \hat{\sigma}^2 + 2l - \min\left(n \ln \hat{\sigma}^2 + 2l\right) \qquad (9)$$

where $y_i$, $\hat{y}_i$ and $\bar{y}$ are the measured, predicted and average
values of the dependent variable, respectively, $\sigma^2$ is the
variance of $y$, $n$ the total number of observations used to fit the
models, $p$ the number of model parameters, $l = p + 1$, and $\hat{\sigma}^2$
the estimator of the error variance of the model, whose value
is obtained as follows:

$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n} \qquad (10)$$

As a rule of thumb, models with AICd < 2 have substantial support
and should receive consideration in making inferences.
Models with AICd of about 4 to 7 have considerably less support,
while models with AICd > 10 have essentially no
support and might be omitted from further consideration, or at
least fail to explain some substantial explainable
variation in the data.

All four equations, with the coefficient values
obtained during the fitting phase, were applied to an independent
data set consisting of 854 diameter-height observations for
validation. The validation of each model was based on the
analysis of the model efficiency (ME), calculated as the adjusted
coefficient of determination, the bias, the root mean squared
error of the estimates, model precision and Akaike's information
criterion differences.

A rank was assigned to each equation for each criterion
(Cao et al. 1980). The ranks were then summed to arrive at
the final rank for each model, which indicates its performance
with respect to all the criteria considered.

Results and discussion

The coefficient values obtained by fitting the four
equations to the fitting data set are given in Table 2. The standard
errors of the regression coefficients are given in
parentheses and show that all the partial regression coefficients
were highly significant (p=0.0001) except the coefficients
related to $D_g$ in Eq. (3) and $G$ in Eq. (4).


Table 2. Estimated coefficient values of different equations for fitting data set

| Equation | a1 | a2 | a3 | a4 | β | β1 | β2 |
|---|---|---|---|---|---|---|---|
| Eq. (1) | 1.254 (0.065) | 0.960 (0.066) | -1.964 (0.133) | | 2.073 (0.044) | | |
| Eq. (2) | 2.954 (0.122) | 0.631 (0.039) | 0.020 (0.017) | | 4.765 (0.068) | | |
| Eq. (3) | 6.492 (0.409) | 1.310 (0.087) | 0.132 (0.042) | 0.018 (0.025) | 3.661 (0.052) | | |
| Eq. (4) | 0.412 (0.016) | 0.571 (0.117) | | | | 0.436 (0.065) | 0.064 (0.005) |

Note: The values in brackets are standard errors for the coefficients


Table 3 compares the fit statistics for the equations and
presents an overview of their performance based
on the statistical criteria used for evaluating the predictive ability of
the models on the fitting data set. The adj. R² values
were generally high (ranging from 0.887 for Eq. (1) to 0.920 for
Eq. (4)) and acceptable for all the equations. The values of bias
and RMSE were lowest for Eq. (4) and highest for
Eq. (1). Eq. (4) also ranked first in model precision, followed
by Eq. (3). The values of Akaike's criterion differences
(AICd) for the fitting data set suggest that only Eq. (4) has substantial
support in the model selection, while all other equations failed
badly to explain some substantial explainable variation in the
data and hence must not be given any consideration during
model selection. The final rank showed that Eq. (4) ranked first
while Eq. (1) ranked last. Thus, Eq. (4) placed first in overall
performance.
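
As a minimal sketch of how three of the fit statistics in Table 3 are obtained, the Python function below computes bias, RMSE and adjusted R² from paired observed and predicted heights, dividing the residual sum of squares by n - p as described for the RMSE. The six height pairs are hypothetical and serve only to exercise the function; p = 4 mirrors the number of parameters of Eq. (4).

```python
import math


def fit_statistics(observed, predicted, p):
    """Return bias, RMSE and adjusted R^2 for a model with p parameters."""
    n = len(observed)
    residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
    sse = sum(e * e for e in residuals)                # residual sum of squares
    mean_y = sum(observed) / n
    sst = sum((y - mean_y) ** 2 for y in observed)     # total sum of squares
    return {
        "bias": sum(residuals) / n,
        "rmse": math.sqrt(sse / (n - p)),
        "r2_adj": 1.0 - ((n - 1) * sse) / ((n - p) * sst),
    }


# Hypothetical observed and predicted tree heights (m), not data from the study.
observed = [4.2, 5.1, 5.8, 6.3, 6.9, 7.4]
predicted = [4.4, 5.0, 5.7, 6.5, 6.8, 7.2]

stats = fit_statistics(observed, predicted, p=4)
print({name: round(value, 4) for name, value in stats.items()})
```

A positive bias under this sign convention (observed minus predicted) means the model underestimates height on average.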

Table 3. Statistical criteria for model evaluation on fitting data set

| Equation | Adj. R² | Bias | RMSE | MPR | AICd | Rank (sum) | Final rank |
|---|---|---|---|---|---|---|---|
| Eq. (1) | 0.887 (4) | 0.02983 (4) | 0.39839 (4) | 0.11352 (4) | 535.26530 (4) | 20 | 4 |
| Eq. (2) | 0.888 (3) | 0.01500 (3) | 0.39672 (3) | 0.11258 (3) | 522.38340 (3) | 15 | 3 |
| Eq. (3) | 0.913 (2) | 0.00680 (2) | 0.34920 (2) | 0.08722 (2) | 16.96423 (2) | 10 | 2 |
| Eq. (4) | 0.920 (1) | -0.00011 (1) | 0.33483 (1) | 0.08019 (1) | 0.00000 (1) | 5 | 1 |

Values in the parentheses are the ranks


Fig. 1 presents the plot of residuals against the observed values
for the fitting data set. As the figure shows,
Eq. (4) had the least dispersion of the residuals. The residual
values varied from -1.05 to 1.20, -0.67 to 2.01, -0.58 to 1.37 and
-0.90 to 0.97 for Eqs. (1), (2), (3) and (4), respectively.

Fig. 2 shows the plot of observed versus predicted values,
which indicates that Eq. (4) performed best and produced the
strongest correlation (R=0.959), followed by Eq. (3) (R=0.956).
The values of the intercept (α) and slope (β) of the linear regression
were -0.03063 and 1.00656, respectively. The slope and intercept
of the fitted straight line were compared with the theoretical
values of 1 and 0 by means of the joint-confidence ellipse F-test
to test the null hypothesis. A confidence ellipse is a
2-dimensional interval in which, with a certain probability, the
true parameter (a 2-dimensional vector) lies. At the confidence
level of 95%, no evident difference was found.

Fig. 1 Residuals for equations 1-4 plotted over observed values for fitting data set

Fig. 2 Linear regression between observed and predicted values for equations 1-4 for fitting data set
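
The regression of predicted on observed heights summarised in Fig. 2 is an ordinary least squares fit of a straight line. The sketch below, in plain Python, returns the intercept and slope that are compared with the ideal values of 0 and 1. The heights are hypothetical, and the joint F-test on the two coefficients is not reproduced here.

```python
def regress_predicted_on_observed(observed, predicted):
    """Ordinary least squares fit: predicted = intercept + slope * observed."""
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, predicted))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical heights (m): predictions that slightly compress the observed range.
observed = [4.0, 5.0, 6.0, 7.0, 8.0]
predicted = [0.1 + 0.98 * h for h in observed]

intercept, slope = regress_predicted_on_observed(observed, predicted)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

An unbiased model gives an intercept near 0 and a slope near 1; a slope below 1 combined with a positive intercept indicates overprediction of short trees and underprediction of tall ones.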
For model validation, an independent data set (validation data
set) was used to assess the predictive ability of the different
equations. The equations (with the parameter values obtained from
the fitting data set) were applied to the validation data set. Table
4 compares the validation statistics for the four equations applied
to the validation data set.


Table 4. Validation statistics for generalised diameter-height equations

| Equation | Adj. R² | Bias | RMSE | MPR | AICd | Rank (sum) | Final rank |
|---|---|---|---|---|---|---|---|
| Eq. (1) | 0.811 (3) | -0.09146 (1) | 0.47815 (3) | 0.18904 (3) | 262.10420 (3) | 13 | 3 |
| Eq. (2) | 0.810 (4) | -0.25237 (3) | 0.47945 (4) | 0.19007 (4) | 266.74360 (4) | 19 | 4 |
| Eq. (3) | 0.860 (2) | -0.23572 (2) | 0.41116 (2) | 0.13978 (2) | 5.28459 (2) | 10 | 2 |
| Eq. (4) | 0.861 (1) | -0.25851 (4) | 0.41012 (1) | 0.13908 (1) | 0.00000 (1) | 8 | 1 |

Values in the parentheses are the ranks
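
The rank sums in Table 4 can be reproduced from the criterion values with the short Python sketch below. It assumes that adj. R² is ranked from highest to lowest, that all other criteria are ranked from lowest to highest, and that bias is ranked on its absolute value. These assumptions match the ranks printed in the table but are not stated explicitly in the text. Ties are not handled, as none occur in the table.

```python
# Criterion values copied from Table 4 (validation data set).
TABLE4 = {
    "Eq. (1)": {"adj_r2": 0.811, "bias": -0.09146, "rmse": 0.47815, "mpr": 0.18904, "aicd": 262.10420},
    "Eq. (2)": {"adj_r2": 0.810, "bias": -0.25237, "rmse": 0.47945, "mpr": 0.19007, "aicd": 266.74360},
    "Eq. (3)": {"adj_r2": 0.860, "bias": -0.23572, "rmse": 0.41116, "mpr": 0.13978, "aicd": 5.28459},
    "Eq. (4)": {"adj_r2": 0.861, "bias": -0.25851, "rmse": 0.41012, "mpr": 0.13908, "aicd": 0.00000},
}


def score(criterion, value):
    """Map a criterion value to a score where smaller is better."""
    if criterion == "adj_r2":
        return -value          # higher adj. R^2 is better
    return abs(value)          # assumption: bias is judged by its magnitude


def rank_sums(table):
    """Rank the models on every criterion (1 = best) and sum the ranks."""
    totals = {model: 0 for model in table}
    criteria = list(next(iter(table.values())))
    for criterion in criteria:
        ordered = sorted(table, key=lambda m: score(criterion, table[m][criterion]))
        for rank, model in enumerate(ordered, start=1):
            totals[model] += rank
    return totals


totals = rank_sums(TABLE4)
final_order = sorted(totals, key=totals.get)  # best model first
print(totals)
print(final_order)
```

The same function applied to the Table 3 values would give the rank sums for the fitting data set.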


Eq. (1) had the minimum bias, while Eq. (4) performed best in
terms of model efficiency (adj. R²), error (RMSE) and model
precision (MPR). The values of Akaike's criterion differences
(AICd) indicated that Eq. (4) had substantial support in the
model selection, while Eq. (3) had considerably less support. The
other equations failed badly to explain some substantial
explainable variation in the data and hence must not be given any
consideration during model selection. The final ranking showed
that Eq. (4) ranked first while Eq. (2) ranked last. Thus, Eq. (4)
placed first in overall performance. It may also be seen that
Eq. (1), which performed poorest during the fitting process, occupied
third place in the model validation phase, replacing Eq. (2).
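
As a consistency check, the AICd column of Table 4 can be approximately recovered from the RMSE column. The Python sketch below assumes that the error variance estimator is the residual sum of squares divided by n, that the residual sum of squares equals RMSE² × (n - p), and that l = p + 1, with p counted from the coefficients listed in Table 2 (4, 4, 5 and 4 for Eqs. (1) to (4)). Small deviations from the tabulated values are expected because the RMSE values are rounded.

```python
import math

N_VALIDATION = 854  # diameter-height observations in the validation data set

# (RMSE from Table 4, number of parameters p counted from Table 2)
MODELS = {
    "Eq. (1)": (0.47815, 4),
    "Eq. (2)": (0.47945, 4),
    "Eq. (3)": (0.41116, 5),
    "Eq. (4)": (0.41012, 4),
}


def aic(rmse, p, n):
    """n * ln(sigma_hat^2) + 2 * (p + 1), with sigma_hat^2 = SSE / n."""
    sse = rmse ** 2 * (n - p)      # assumption: RMSE uses n - p in the denominator
    sigma2_hat = sse / n
    return n * math.log(sigma2_hat) + 2 * (p + 1)


def aic_differences(models, n):
    """Difference between each model's AIC and the smallest AIC in the set."""
    values = {name: aic(rmse, p, n) for name, (rmse, p) in models.items()}
    best = min(values.values())
    return {name: value - best for name, value in values.items()}


aicd = aic_differences(MODELS, N_VALIDATION)
for name, value in aicd.items():
    print(f"{name}: AICd = {value:.2f}")
```

Under the rule of thumb given in the methods, only the model with AICd below 2 retains substantial support, which here is Eq. (4).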

Fig. 3 shows the plot of residuals against the observed values for
the validation data set. As the figure shows,
Eq. (4) produced the best results and the least dispersion of
the residuals, while the dispersion was highest for Eq. (2). The
residual values varied from -1.21 to 0.87, -0.84 to 2.20, -0.66 to
1.35 and -0.62 to 1.02 for Eqs. (1), (2), (3) and (4), respectively.
This result is in conformity with the ranking shown in Table 4.
The figure also indicates that for Eqs. (2) and (3) the residuals
increased with tree height, and hence these
models predicted the heights of larger trees less accurately.



Fig. 3 Residuals for Eqs. (1)-(4) plotted over observed values for validation data set

In the present study we recommend the use of Eq. (4)
based on the fit and validation statistics. The final equation, based
on pooling the fitting and validation data sets, is given below:

$$h = 1.3 + 0.70857 \, H_0^{0.08300} \, d^{\,0.43503 \, H_0^{[\text{illegible}]}}$$

$R^2$ = [illegible]
Conclusions

Four generalised diameter-height equations were tested for their
performance, and the evaluation criteria for the fitting data set
showed that the performance of Eq. (4) was the best. Eq. (4)
produced the minimum values of mean residual and root mean
squared error, and a model precision close to its ideal value.
Finally, the equations were validated on an independent data
set, and again Eq. (4) performed best based on
the statistical criteria. It was also noticed that Eq. (1), which performed
poorly during the fitting phase, moved up to third place in the
validation phase. Considering both the fitting and the
validation criteria, the overall performance of Eq.
(4) was better than that of the other equations. Moreover,
Eq. (4) needs data only for dominant height and tree diameter
(fewer independent variables) to estimate tree heights, while the other
equations need more independent variables for the prediction of
tree height. Thus, Eq. (4) is easy to use and may be preferred.